EMNLP.2019 - System Demonstrations | Cool Papers

#1 ABSApp: A Portable Weakly-Supervised Aspect-Based Sentiment Extraction System

Authors: Oren Pereg ; Daniel Korat ; Moshe Wasserblat ; Jonathan Mamou ; Ido Dagan

We present ABSApp, a portable system for weakly-supervised aspect-based sentiment extraction. The system is interpretable and user friendly and does not require labeled training data, hence can be rapidly and cost-effectively used across different domains in applied setups. The system flow includes three stages: First, it generates domain-specific aspect and opinion lexicons based on an unlabeled dataset; second, it enables the user to view and edit those lexicons (weak supervision); and finally, it enables the user to select an unlabeled target dataset from the same domain, classify it, and generate an aspect-based sentiment report. ABSApp has been successfully used in a number of real-life use cases, among them movie review analysis and convention impact analysis.
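To make the final stage of this flow concrete, here is a minimal sketch of lexicon-driven aspect sentiment scoring. It is not ABSApp's classifier, which the abstract does not describe; the function name `aspect_sentiment`, the proximity window, and the toy lexicons are all assumptions made for illustration.

```python
def aspect_sentiment(sentence, aspect_lexicon, opinion_lexicon, window=3):
    """Score each aspect term by summing the polarity of nearby opinion terms.

    Hypothetical heuristic: an opinion word counts toward an aspect if it
    lies within `window` tokens of it. Real systems typically use syntax.
    """
    tokens = sentence.lower().split()
    results = {}
    for i, token in enumerate(tokens):
        if token in aspect_lexicon:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            score = sum(opinion_lexicon.get(t, 0) for t in tokens[start:end])
            results[token] = results.get(token, 0) + score
    return results


# Toy lexicons standing in for the user-edited ones from stage two.
aspects = {"plot", "acting"}
opinions = {"great": 1, "dull": -1}
report = aspect_sentiment(
    "the plot was great but the acting felt dull", aspects, opinions, window=2
)
```

Here `report` maps each aspect found in the sentence to a signed score, which is the kind of per-aspect signal a sentiment report could aggregate over a dataset.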

#2 AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models

Authors: Eric Wallace ; Jens Tuyls ; Junlin Wang ; Sanjay Subramanian ; Matt Gardner ; Sameer Singh

Neural NLP models are increasingly accurate but are imperfect and opaque—they break in counterintuitive ways and leave end users puzzled at their behavior. Model interpretation methods ameliorate this opacity by providing explanations for specific model predictions. Unfortunately, existing interpretation codebases make it difficult to apply these methods to new models and tasks, which hinders adoption for practitioners and burdens interpretability researchers. We introduce AllenNLP Interpret, a flexible framework for interpreting NLP models. The toolkit provides interpretation primitives (e.g., input gradients) for any AllenNLP model and task, a suite of built-in interpretation methods, and a library of front-end visualization components. We demonstrate the toolkit’s flexibility and utility by implementing live demos for five interpretation methods (e.g., saliency maps and adversarial attacks) on a variety of models and tasks (e.g., masked language modeling using BERT and reading comprehension using BiDAF). These demos, alongside our code and tutorials, are available at https://allennlp.org/interpret.

#3 ALTER: Auxiliary Text Rewriting Tool for Natural Language Generation

Authors: Qiongkai Xu ; Chenchen Xu ; Lizhen Qu

In this paper, we describe ALTER, an auxiliary text rewriting tool that facilitates the rewriting process for natural language generation tasks, such as paraphrasing, text simplification, fairness-aware text rewriting, and text style transfer. Our tool is characterized by two features: (i) recording of word-level revision histories and (ii) flexible auxiliary edit support and feedback to annotators. The text rewriting assistance and traceable rewriting history are potentially beneficial to future research on natural language generation.

#4 Applying BERT to Document Retrieval with Birch

Authors: Zeynep Akkalyoncu Yilmaz ; Shengjin Wang ; Wei Yang ; Haotian Zhang ; Jimmy Lin

We present Birch, a system that applies BERT to document retrieval via integration with the open-source Anserini information retrieval toolkit to demonstrate end-to-end search over large document collections. Birch implements simple ranking models that achieve state-of-the-art effectiveness on standard TREC newswire and social media test collections. This demonstration focuses on technical challenges in the integration of NLP and IR capabilities, along with the design rationale behind our approach to tightly-coupled integration between Python (to support neural networks) and the Java Virtual Machine (to support document retrieval using the open-source Lucene search library). We demonstrate integration of Birch with an existing search interface as well as interactive notebooks that highlight its capabilities in an easy-to-understand manner.

#5 Automatic Taxonomy Induction and Expansion

Authors: Nicolas Rodolfo Fauceglia ; Alfio Gliozzo ; Sarthak Dash ; Md. Faisal Mahbub Chowdhury ; Nandana Mihindukulasooriya

The Knowledge Graph Induction Service (KGIS) is an end-to-end knowledge induction system. One of its main capabilities is to automatically induce taxonomies from input documents using a hybrid approach that takes advantage of linguistic patterns, semantic web and neural networks. KGIS allows the user to semi-automatically curate and expand the induced taxonomy through a component called Smart SpreadSheet by exploiting distributional semantics. In this paper, we describe these taxonomy induction and expansion features of KGIS. A screencast video demonstrating the system is available at https://ibm.box.com/v/emnlp-2019-demo.

#6 CFO: A Framework for Building Production NLP Systems

Authors: Rishav Chakravarti ; Cezar Pendus ; Andrzej Sakrajda ; Anthony Ferritto ; Lin Pan ; Michael Glass ; Vittorio Castelli ; J. William Murdock ; Radu Florian ; Salim Roukos ; Avi Sil

This paper introduces a novel orchestration framework, called CFO (Computation Flow Orchestrator), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. We then demonstrate a question answering system built using this framework which incorporates state-of-the-art BERT based MRC (Machine Reading Comprehension) with IR components to enable end-to-end answer retrieval. Results from the demo system are shown to be high quality in both academic and industry domain specific settings. Finally, we discuss best practices when (pre-)training BERT based MRC models for production systems. Screencast links: short video (< 3 min): http://ibm.biz/gaama_demo; supplementary long video (< 13 min): http://ibm.biz/gaama_cfo_demo

#7 Chameleon: A Language Model Adaptation Toolkit for Automatic Speech Recognition of Conversational Speech

Authors: Yuanfeng Song ; Di Jiang ; Weiwei Zhao ; Qian Xu ; Raymond Chi-Wing Wong ; Qiang Yang

The language model is a vital component in modern automatic speech recognition (ASR) systems. Since a “one-size-fits-all” language model works suboptimally for conversational speech, language model adaptation (LMA) is considered a promising solution to this problem. In order to compare state-of-the-art LMA techniques and systematically demonstrate their effect in conversational speech recognition, we develop a novel toolkit named Chameleon, which includes state-of-the-art cache-based and topic-based LMA techniques. This demonstration not only vividly visualizes the underlying working mechanisms of a variety of state-of-the-art LMA models but also provides an interface for the user to customize their hyperparameters. With this demonstration, the audience can experience the effect of LMA in an interactive and real-time fashion. We hope this demonstration will inspire more research on better language model techniques for ASR.
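As a rough illustration of what cache-based adaptation means, the sketch below interpolates a fixed unigram model with a cache of recently observed words. It is a textbook-style toy, not Chameleon's implementation; the class name `CacheAdaptedUnigramLM`, the cache size, and the interpolation weight are assumptions.

```python
from collections import Counter, deque


class CacheAdaptedUnigramLM:
    """Toy unigram LM linearly interpolated with a cache of recent words."""

    def __init__(self, base_probs, cache_size=100, cache_weight=0.2):
        self.base_probs = base_probs            # word -> base probability
        self.cache = deque(maxlen=cache_size)   # keeps only the newest words
        self.cache_weight = cache_weight        # hypothetical hyperparameter

    def observe(self, words):
        """Add words from the conversation so far to the cache."""
        self.cache.extend(words)

    def prob(self, word):
        """Return the adapted probability of `word`."""
        base = self.base_probs.get(word, 0.0)
        if not self.cache:
            return base
        cache_prob = Counter(self.cache)[word] / len(self.cache)
        return (1 - self.cache_weight) * base + self.cache_weight * cache_prob


base = {"the": 0.5, "meeting": 0.1, "cat": 0.4}
lm = CacheAdaptedUnigramLM(base, cache_weight=0.5)
before = lm.prob("meeting")
lm.observe(["meeting", "the", "meeting", "meeting"])
after = lm.prob("meeting")
```

After the conversation mentions "meeting" several times, `after` exceeds `before`, which is the effect cache-based adaptation aims for: words used recently become more likely in the near future.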

#8 Controlling Sequence-to-Sequence Models - A Demonstration on Neural-based Acrostic Generator

Authors: Liang-Hsin Shen ; Pei-Lun Tai ; Chao-Chung Wu ; Shou-De Lin

An acrostic is a form of writing in which the first token of each line (or another recurring feature in the text) forms a meaningful sequence. In this paper we present a generalized acrostic generation system that can hide a certain message in a flexible pattern specified by the users. Different from previous works that focus on rule-based solutions, here we adopt a neural-based sequence-to-sequence model to achieve this goal. Besides acrostic, users are also allowed to specify the rhyme and length of the output sequences. To our knowledge, this is the first neural-based natural language generation system that demonstrates the capability of performing micro-level control over output sentences.
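The acrostic constraint itself is easy to state in code. The sketch below only reads the hidden message back out of a finished text; it says nothing about how the paper's neural model generates such text. The function name `hidden_message` and the per-line token position pattern are assumptions, and every line is assumed to contain enough tokens.

```python
def hidden_message(lines, token_positions=None):
    """Concatenate the first character of one chosen token per line.

    `token_positions[i]` is the index of the token to read on line i;
    by default the first token of every line is used (a classic acrostic).
    """
    if token_positions is None:
        token_positions = [0] * len(lines)
    chars = []
    for line, position in zip(lines, token_positions):
        tokens = line.split()
        chars.append(tokens[position][0])
    return "".join(chars)


poem = ["Never stop", "Learning every", "Page turns"]
classic = hidden_message(poem)               # first token of each line
shifted = hidden_message(poem, [1, 1, 1])    # second token of each line
```

A generator with "flexible patterns" would have to satisfy a check like this for whatever positions the user specifies.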

#9 EASSE: Easier Automatic Sentence Simplification Evaluation

Authors: Fernando Alva-Manchego ; Louis Martin ; Carolina Scarton ; Lucia Specia

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems. EASSE provides a single access point to a broad range of evaluation resources: standard automatic metrics for assessing SS outputs (e.g. SARI), word-level accuracy scores for certain simplification transformations, reference-independent quality estimation features (e.g. compression ratio), and standard test data for SS evaluation (e.g. TurkCorpus). Finally, EASSE generates easy-to-visualise reports on the various metrics and features above and on how a particular SS output fares against reference simplifications. Through experiments, we show that these functionalities allow for better comparison and understanding of the performance of SS systems.
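As an example of a reference-independent feature, the compression ratio can be computed from the original sentence and the system output alone. The sketch below uses a character-length ratio, which is one common definition; it does not call EASSE, and the function names are hypothetical, so the package's exact definition may differ.

```python
def compression_ratio(original, simplified):
    """Character length of the simplification divided by that of the original."""
    if not original:
        raise ValueError("original sentence must be non-empty")
    return len(simplified) / len(original)


def corpus_compression_ratio(originals, simplifieds):
    """Average the sentence-level ratios over aligned sentence pairs."""
    ratios = [compression_ratio(o, s) for o, s in zip(originals, simplifieds)]
    return sum(ratios) / len(ratios)


single = compression_ratio("abcdefghij", "abcde")
average = corpus_compression_ratio(["abcd", "abcd"], ["ab", "abcd"])
```

A ratio below 1 means the output is shorter than the input; no reference simplification is needed to compute it.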

#10 EGG: a toolkit for research on Emergence of lanGuage in Games

Authors: Eugene Kharitonov ; Rahma Chaabouni ; Diane Bouchacourt ; Marco Baroni

There is renewed interest in simulating language emergence among deep neural agents that communicate to jointly solve a task, spurred by the practical aim to develop language-enabled interactive AIs, as well as by theoretical questions about the evolution of human language. However, optimizing deep architectures connected by a discrete communication channel (such as that in which language emerges) is technically challenging. We introduce EGG, a toolkit that greatly simplifies the implementation of emergent-language communication games. EGG’s modular design provides a set of building blocks that the user can combine to create new games, easily navigating the optimization and architecture space. We hope that the tool will lower the technical barrier, and encourage researchers from various backgrounds to do original work in this exciting area.

#11 Entity resolution for noisy ASR transcripts

Authors: Arushi Raghuvanshi ; Vijay Ramakrishnan ; Varsha Embar ; Lucien Carroll ; Karthik Raghunathan

Large vocabulary domain-agnostic Automatic Speech Recognition (ASR) systems often mistranscribe domain-specific words and phrases. Since these generic ASR systems are the first component of most voice assistants in production, building Natural Language Understanding (NLU) systems that are robust to these errors can be a challenging task. In this paper, we focus on handling ASR errors in named entities, specifically person names, for a voice-based collaboration assistant. We demonstrate an effective method for resolving person names that are mistranscribed by black-box ASR systems, using character and phoneme-based information retrieval techniques and contextual information, which improves accuracy by 40.8% on our production system. We provide a live interactive demo to further illustrate the nuances of this problem and the effectiveness of our solution.
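A minimal sketch of the character-based half of this idea: match a mistranscribed name against a directory by character n-gram overlap. This is not the authors' system, which also uses phoneme-based retrieval and contextual information; the function names, the Jaccard scoring, and the example directory are assumptions.

```python
def char_ngrams(text, n=3):
    """Return the set of padded, lowercased character n-grams of `text`."""
    pad = "#" * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def resolve_name(transcribed, directory, n=3):
    """Pick the directory entry whose n-grams best overlap the transcript."""
    query = char_ngrams(transcribed, n)
    return max(directory, key=lambda name: jaccard(query, char_ngrams(name, n)))


# Hypothetical directory and a plausible ASR mistranscription.
directory = ["Priya Natarajan", "Lucien Carroll", "Marta Kowalski"]
best = resolve_name("lucian carol", directory)
```

Character n-grams tolerate small spelling differences, which is why they are a natural first step for names an ASR system has never seen; sound-alike errors with very different spellings are where a phoneme-based index would be needed.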

#12 EUSP: An Easy-to-Use Semantic Parsing PlatForm

Authors: Bo An ; Chen Bo ; Xianpei Han ; Le Sun

Semantic parsing aims to map natural language utterances into structured meaning representations. We present a modular platform, EUSP (Easy-to-Use Semantic Parsing PlatForm), that helps developers build a semantic parser from scratch. Instead of requiring a large amount of training data or complex grammar knowledge, our platform lets developers build a grammar-based or neural-based semantic parser through configuration files that specify the modules and components composing the semantic parsing system. A high-quality grammar-based semantic parsing system requires only domain lexicons rather than costly training data. Furthermore, we provide a browser-based method to generate the semantic parsing system to minimize the difficulty of development. Experimental results show that the neural-based semantic parser achieves competitive performance on the semantic parsing task, and grammar-based semantic parsers significantly improve the performance of a business search engine.

#13 FAMULUS: Interactive Annotation and Feedback Generation for Teaching Diagnostic Reasoning

Authors: Jonas Pfeiffer ; Christian M. Meyer ; Claudia Schulz ; Jan Kiesewetter ; Jan Zottmann ; Michael Sailer ; Elisabeth Bauer ; Frank Fischer ; Martin R. Fischer ; Iryna Gurevych

Our proposed system FAMULUS helps students learn to diagnose based on automatic feedback in virtual patient simulations, and it supports instructors in labeling training data. Diagnosing is an exceptionally difficult skill to obtain but vital for many different professions (e.g., medical doctors, teachers). Previous case simulation systems are limited to multiple-choice questions and thus cannot give constructive individualized feedback on a student’s diagnostic reasoning process. Given initially only limited data, we leverage a (replaceable) NLP model both to support experts in their further data annotation with automatic suggestions and to provide automatic feedback for students. We argue that because the central model consistently improves, our interactive approach encourages both students and instructors to recurrently use the tool, thus accelerating data creation and annotation. We show results from two user studies on diagnostic reasoning in medicine and teacher education and outline how our system can be extended to further use cases.

#14 Gunrock: A Social Bot for Complex and Engaging Long Conversations

Authors: Dian Yu ; Michelle Cohn ; Yi Mang Yang ; Chun Yen Chen ; Weiming Wen ; Jiaping Zhang ; Mingyang Zhou ; Kevin Jesse ; Austin Chau ; Antara Bhowmick ; Shreenath Iyer ; Giritheja Sreenivasulu ; Sam Davidson ; Ashwin Bhandare ; Zhou Yu

Gunrock is the winner of the 2018 Amazon Alexa Prize, as evaluated by coherence and engagement from both real users and Amazon-selected expert conversationalists. We focus on understanding complex sentences and having in-depth conversations in open domains. In this paper, we introduce some innovative system designs and related validation analysis. Overall, we found that users produce longer sentences to Gunrock, which are directly related to users’ engagement (e.g., ratings, number of turns). Additionally, users’ backstory queries about Gunrock are positively correlated to user satisfaction. Finally, we found dialog flows that interleave facts and personal opinions and stories lead to better user satisfaction.

#15 HARE: a Flexible Highlighting Annotator for Ranking and Exploration

Authors: Denis Newman-Griffis ; Eric Fosler-Lussier

Exploration and analysis of potential data sources is a significant challenge in the application of NLP techniques to novel information domains. We describe HARE, a system for highlighting relevant information in document collections to support ranking and triage, which provides tools for post-processing and qualitative analysis for model development and tuning. We apply HARE to the use case of narrative descriptions of mobility information in clinical data, and demonstrate its utility in comparing candidate embedding features. We provide a web-based interface for annotation visualization and document ranking, with a modular backend to support interoperability with existing annotation tools. Our system is available online at https://github.com/OSU-slatelab/HARE.

#16 Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting

Authors: Jaejun Lee ; Raphael Tang ; Jimmy Lin

Used for simple command recognition on devices from smart speakers to mobile phones, keyword spotting systems are everywhere. Ubiquitous as well are web applications, which have grown in popularity and complexity over the last decade. However, despite their obvious advantages in natural language interaction, voice-enabled web applications are still few and far between. We attempt to bridge this gap with Honkling, a novel, JavaScript-based keyword spotting system. Purely client-side and cross-device compatible, Honkling can be deployed directly on user devices. Our in-browser implementation enables seamless personalization, which can greatly improve model quality; in the presence of underrepresented, non-American user accents, we can achieve up to an absolute 10% increase in accuracy in the personalized model with only a few examples.

#17 IFlyLegal: A Chinese Legal System for Consultation, Law Searching, and Document Analysis

Authors: Ziyue Wang ; Baoxin Wang ; Xingyi Duan ; Dayong Wu ; Shijin Wang ; Guoping Hu ; Ting Liu

Legal Tech is developed to help people with legal services and solve legal problems via machines. To achieve this, one of the key requirements for machines is to utilize legal knowledge and comprehend legal context. This can be fulfilled by natural language processing (NLP) techniques such as text representation, text categorization, question answering (QA) and natural language inference. To this end, we introduce a freely available Chinese Legal Tech system (IFlyLegal) that benefits from multiple NLP tasks. It is an integrated system that performs legal consulting, multi-way law searching, and legal document analysis by exploiting techniques such as deep contextual representations and various attention mechanisms. To our knowledge, IFlyLegal is the first Chinese legal system that employs up-to-date NLP techniques and caters for the needs of different user groups, such as lawyers, judges, procurators, and clients. Since January 2019, we have gathered 2,349 users and 28,238 page views (as of June 23, 2019).

#18 INMT: Interactive Neural Machine Translation Prediction

Authors: Sebastin Santy ; Sandipan Dandapat ; Monojit Choudhury ; Kalika Bali

In this paper, we demonstrate an Interactive Machine Translation interface that assists human translators with on-the-fly hints and suggestions. This makes the end-to-end translation process faster and more efficient, and yields high-quality translations. We augment the OpenNMT backend with a mechanism to accept the user input and generate conditioned translations.

#19 Joey NMT: A Minimalist NMT Toolkit for Novices

Authors: Julia Kreutzer ; Jasmijn Bastings ; Stefan Riezler

We present Joey NMT, a minimalist neural machine translation toolkit based on PyTorch that is specifically designed for novices. Joey NMT provides many popular NMT features in a small and simple code base, so that novices can easily and quickly learn to use it and adapt it to their needs. Despite its focus on simplicity, Joey NMT supports classic architectures (RNNs, transformers), fast beam search, weight tying, and more, and achieves performance comparable to more complex toolkits on standard benchmarks. We evaluate the accessibility of our toolkit in a user study where novices with general knowledge about PyTorch and NMT and experts work through a self-contained Joey NMT tutorial, showing that novices perform almost as well as experts in a subsequent code quiz. Joey NMT is available at https://github.com/joeynmt/joeynmt.

#20 Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis

Authors: Twin Karmakharm ; Nikolaos Aletras ; Kalina Bontcheva

Automatically identifying rumours in social media and assessing their veracity is an important task with downstream applications in journalism. A significant challenge is how to keep rumour analysis tools up-to-date as new information becomes available for particular rumours that spread in a social network. This paper presents a novel open-source web-based rumour analysis tool that can continuously learn from journalists. The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface. The feedback allows the system to improve an underlying state-of-the-art neural network-based rumour classification model. The system can be easily integrated as a service into existing tools and platforms used by journalists using a REST API.

#21 LIDA: Lightweight Interactive Dialogue Annotator

Authors: Edward Collins ; Nikolai Rozanov ; Bingbing Zhang

Dialogue systems have the potential to change how people interact with machines but are highly dependent on the quality of the data used to train them. It is therefore important to develop good dialogue annotation tools which can improve the speed and quality of dialogue data annotation. With this in mind, we introduce LIDA, an annotation tool designed specifically for conversation data. As far as we know, LIDA is the first dialogue annotation system that handles the entire dialogue annotation pipeline from raw text, as may be the output of transcription services, to structured conversation data. Furthermore it supports the integration of arbitrary machine learning models as annotation recommenders and also has a dedicated interface to resolve inter-annotator disagreements such as after crowdsourcing annotations for a dataset. LIDA is fully open source, documented and publicly available at https://github.com/Wluper/lida. Screencast: https://vimeo.com/329824847

#22 LINSPECTOR WEB: A Multilingual Probing Suite for Word Representations

Authors: Max Eichler ; Gözde Gül Şahin ; Iryna Gurevych

We present LINSPECTOR WEB, an open source multilingual inspector to analyze word representations. Our system provides researchers working in low-resource settings with an easily accessible web-based probing tool to gain quick insights into their word embeddings, especially outside of the English language. To do this we employ 16 simple linguistic probing tasks such as gender, case marking, and tense for a diverse set of 28 languages. We support probing of static word embeddings along with pretrained AllenNLP models that are commonly used for NLP downstream tasks such as named entity recognition, natural language inference and dependency parsing. The results are visualized in a polar chart and also provided as a table. LINSPECTOR WEB is available as an offline tool or at https://linspector.ukp.informatik.tu-darmstadt.de.

#23 MAssistant: A Personal Knowledge Assistant for MOOC Learners

Authors: Lan Jiang ; Shuhan Hu ; Mingyu Huang ; Zhichun Wang ; Jinjian Yang ; Xiaoju Ye ; Wei Zheng

Massive Open Online Courses (MOOCs) have developed rapidly and attracted a large number of learners. In this work, we present the MAssistant system, a personal knowledge assistant for MOOC learners. MAssistant helps users to trace the concepts they have learned in MOOCs, and to build their own concept graphs. There are three key components in MAssistant: (i) a large-scale concept graph built from open data sources, which contains concepts in various domains and relations among them; (ii) a browser extension which interacts with learners when they are watching video lectures, and presents important concepts to them; (iii) a web application allowing users to explore their personal concept graphs, which are built based on their learning activities on MOOCs. MAssistant will facilitate the knowledge management task for MOOC learners, and make learning on MOOCs easier.

#24 MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation

Authors: Thomas Searle ; Zeljko Kraljevic ; Rebecca Bendayan ; Daniel Bean ; Richard Dobson

An interface for building, improving and customising a given Named Entity Recognition and Linking (NER+L) model for biomedical domain text, and the efficient collation of accurate research use case specific training data and subsequent model training. Screencast demo available here: https://www.youtube.com/watch?v=lM914DQjvSo

#25 Memory Grounded Conversational Reasoning

Authors: Seungwhan Moon ; Pararth Shah ; Rajen Subba ; Anuj Kumar

We demonstrate a conversational system which engages the user through a multi-modal, multi-turn dialog over the user’s memories. The system can perform QA over memories by responding to user queries to recall specific attributes and associated media (e.g. photos) of past episodic memories. The system can also make proactive suggestions to surface related events or facts from past memories to make conversations more engaging and natural. To implement such a system, we collect a new corpus of memory grounded conversations, which comprises human-to-human role-playing dialogs given synthetic memory graphs with simulated attributes. Our proof-of-concept system operates on these synthetic memory graphs; however, it can be trained and applied to real-world user memory data (e.g. photo albums). We present the architecture of the proposed conversational system, and example queries that the system supports.